Conversation
Hi Kathan, this is very comprehensive.
We'll go over design choices in a meeting, but this should give you an idea of what I'm thinking so far. Unfortunately, this is too large in its current state for me to get through in full.
I've only reached up to the dataloader part of the training script.
Fairly certain you can remove this.
Fairly certain you can also remove this.
You can definitely remove this.
| test_size: Proportion of data for testing | ||
| stratify: Whether to use stratified splitting | ||
| random_state: Random state for reproducibility | ||
| feature_engineering: Whether to perform automated feature engineering |
Good that you included the args here!
| self.stratify = stratify | ||
| self.random_state = random_state | ||
|
|
||
| # Initialize DataLoader |
At this point, can you find and replace all "Initialize" with "Initialise", please?
| self.random_state = random_state | ||
|
|
||
| # Initialize DataLoader | ||
| self.data_loader = DataLoader( |
This is very different to our other projects. I think this is in place of the data importer we use elsewhere?
|
|
||
| # Import utility modules | ||
| from .utils.data_utils import FeatureEngineer, FeatureSelector, DataTransformer, DataProfiler | ||
| from .utils.visualise import DataVisualizer |
|
|
||
| class DataLoader: | ||
| """ | ||
| Enhanced class to load, clean, visualize and prepare data for machine learning. |
"Enhanced" from what?
This docstring is quite vague.
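To show what a more specific docstring could look like, here is a rough sketch. The class name `DataLoaderSketch`, the constructor parameters, their defaults, and the range check are all placeholders I've assumed (the arg descriptions are borrowed from the ones documented in the training script), not what this PR actually defines:

```python
class DataLoaderSketch:
    """Load, clean, visualise and prepare tabular data for model training.

    Hypothetical example only: the parameters and defaults below are
    assumptions, borrowed from the args documented elsewhere in this PR.

    Args:
        test_size: Proportion of data for testing (assumed to lie between 0 and 1).
        stratify: Whether to use stratified splitting.
        random_state: Random state for reproducibility.
    """

    def __init__(self, test_size: float = 0.2, stratify: bool = True,
                 random_state: int = 42):
        # Assumed validation: reject proportions outside the open interval (0, 1).
        if not 0.0 < test_size < 1.0:
            raise ValueError("test_size must be between 0 and 1")
        self.test_size = test_size
        self.stratify = stratify
        self.random_state = random_state
```

The point is that the summary line states what the class does rather than calling it "enhanced", and each argument is described where the reader will look for it.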
Linked Issue(s)
Summary of changes
Added the structured data template, with a standalone file for data exploration, on top of the existing template V2.
Reason for changes
Enhancement